fix: get Xiaohongshu fulltext #17075
Successfully generated as following: http://localhost:1200/xiaohongshu/user/5b55f1534eacab0302da1f02/notes/fulltext - Failed ❌
This comment was marked as off-topic.
This requires a self-hosted instance.
OK, I'll pull your code and try it.
The code still seems to have a few problems. I pulled your repository and made some changes, and now it works.
Will this be merged? @DIYgod
Force-pushed from 8ef9ae7 to d224d67.
async function getUser(url, cookie) {
    const res = await got(url, {
        headers: {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36',
RSSHub will use a randomised user agent of Chrome on mac by default. Does the site only work with this fixed version of Chrome on Windows?
const data = (await cache.tryGet(link, async () => {
    const res = await got(link, {
        headers: {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36',
RSSHub will use a randomised user agent of Chrome on mac by default. Does the site only work with this fixed version of Chrome on Windows?
    const res = await got(url, {
        headers: {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36',
            Cookie: cookie,
        },
    });
This request does not require a cookie.
    let script = $('script')
        .filter((i, script) => {
            const text = script.children[0]?.data;
            return text?.startsWith('window.__INITIAL_STATE__=');
        })
        .text();
    script = script.slice('window.__INITIAL_STATE__='.length);
    script = script.replaceAll('undefined', 'null');
Suggested change:
-    let script = $('script')
-        .filter((i, script) => {
-            const text = script.children[0]?.data;
-            return text?.startsWith('window.__INITIAL_STATE__=');
-        })
-        .text();
-    script = script.slice('window.__INITIAL_STATE__='.length);
-    script = script.replaceAll('undefined', 'null');
+    const script = $("script:contains('__INITIAL_STATE__')")
+        .text()
+        .match(/window\.__INITIAL_STATE__=(.*)/)?.[1]
+        ?.replaceAll('undefined', 'null');
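To illustrate why the `undefined` to `null` substitution is there, here is a minimal sketch of the extraction step using only a plain string and a regular expression, without cheerio. The helper name `parseInitialState` and the sample HTML are hypothetical and are not part of the PR. The sketch assumes the page embeds a JavaScript object literal that becomes valid JSON once bare `undefined` values are replaced.

```typescript
// Hypothetical helper (not from the PR): pulls the object assigned to
// window.__INITIAL_STATE__ out of an HTML string and parses it as JSON.
function parseInitialState(html: string): unknown {
    // Capture everything between the assignment and the closing </script> tag.
    const raw = html.match(/window\.__INITIAL_STATE__=([\s\S]*?)<\/script>/)?.[1];
    if (raw === undefined) {
        throw new Error('window.__INITIAL_STATE__ not found');
    }
    // JSON has no `undefined`, so every occurrence is rewritten to `null`.
    // This is equivalent to replaceAll('undefined', 'null') and, like it,
    // also rewrites the word if it appears inside a string value.
    return JSON.parse(raw.replace(/undefined/g, 'null'));
}

// Made-up sample page for illustration only.
const sampleHtml = '<html><script>window.__INITIAL_STATE__={"note":{"title":"hello","desc":undefined}}</script></html>';
const state = parseInitialState(sampleHtml) as { note: { title: string; desc: string | null } };
console.log(state.note.title, state.note.desc);
```

Without the substitution, `JSON.parse` would throw a `SyntaxError` on the bare `undefined` token.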
async function renderNotesFulltext(notes, url) {
    const data: any[] = [];
    const promises = notes.flatMap((note) =>
        note.map(async ({ noteCard }) => {
            const link = `${url}/${noteCard.noteId}`;
            const { title, description, pubDate } = await getFullNote(link);
            return {
                title,
                link,
                description,
                author: noteCard.user.nickName,
                guid: noteCard.noteId,
                pubDate,
            };
        })
    );
    data.push(...(await Promise.all(promises)));
    return data;
}
Please remove the unnecessary spreading and push.
Suggested change:
-async function renderNotesFulltext(notes, url) {
-    const data: any[] = [];
-    const promises = notes.flatMap((note) =>
-        note.map(async ({ noteCard }) => {
-            const link = `${url}/${noteCard.noteId}`;
-            const { title, description, pubDate } = await getFullNote(link);
-            return {
-                title,
-                link,
-                description,
-                author: noteCard.user.nickName,
-                guid: noteCard.noteId,
-                pubDate,
-            };
-        })
-    );
-    data.push(...(await Promise.all(promises)));
-    return data;
-}
+function renderNotesFulltext(notes, url) {
+    const promises = notes.flatMap((note) =>
+        note.map(async ({ noteCard }) => {
+            const link = `${url}/${noteCard.noteId}`;
+            const { title, description, pubDate } = await getFullNote(link);
+            return {
+                title,
+                link,
+                description,
+                author: noteCard.user.nickName,
+                guid: noteCard.noteId,
+                pubDate,
+            };
+        })
+    );
+    return Promise.all(promises);
+}
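The suggested shape can be tried in isolation with a stub in place of the network call. The following is a sketch only: `getFullNoteStub`, `renderNotes`, the two interfaces, and the sample data are hypothetical stand-ins, and the real `getFullNote` in the PR performs an HTTP request and returns more fields. It shows that returning `Promise.all(promises)` directly yields the same flat, ordered array that the intermediate `data.push(...)` produced.

```typescript
// Hypothetical types modelling only the fields used below.
interface NoteCard {
    noteId: string;
    user: { nickName: string };
}
interface FeedItem {
    title: string;
    link: string;
    author: string;
    guid: string;
}

// Stub standing in for the PR's getFullNote, which fetches the note page.
async function getFullNoteStub(link: string): Promise<{ title: string }> {
    return { title: `Note at ${link}` };
}

// Flattens the nested groups into one list of promises and returns
// Promise.all directly, with no intermediate array and no async wrapper.
function renderNotes(notes: { noteCard: NoteCard }[][], url: string): Promise<FeedItem[]> {
    const promises = notes.flatMap((group) =>
        group.map(async ({ noteCard }) => {
            const link = `${url}/${noteCard.noteId}`;
            const { title } = await getFullNoteStub(link);
            return { title, link, author: noteCard.user.nickName, guid: noteCard.noteId };
        })
    );
    return Promise.all(promises);
}

// Made-up sample data: two groups holding three notes in total.
const sampleNotes = [
    [{ noteCard: { noteId: 'a1', user: { nickName: 'alice' } } }],
    [
        { noteCard: { noteId: 'b2', user: { nickName: 'bob' } } },
        { noteCard: { noteId: 'c3', user: { nickName: 'carol' } } },
    ],
];
const rendered = renderNotes(sampleNotes, 'https://example.com/explore');
rendered.then((items) => console.log(items.map((item) => item.guid)));
```

`Promise.all` resolves with results in the order of the input promises, so the output order follows the flattened input regardless of which request finishes first.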
    let script = $('script')
        .filter((i, script) => {
            const text = script.children[0]?.data;
            return text?.startsWith('window.__INITIAL_STATE__=');
        })
        .text();
    script = script.slice('window.__INITIAL_STATE__='.length);
    script = script.replaceAll('undefined', 'null');
Same as above.
Involved Issue
Close #16300
Example for the Proposed Route(s)
New RSS Route Checklist
Puppeteer
Note
Need to add XIAOHONGSHU_COOKIE.
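As a rough illustration of how such a setting could be consumed, the sketch below builds request headers that include a `Cookie` entry only when `XIAOHONGSHU_COOKIE` is set. It assumes the value is supplied as an environment variable on a self-hosted instance. The helper `buildHeaders` is hypothetical and does not reflect how RSSHub actually reads its configuration.

```typescript
// Hypothetical helper (not RSSHub's config API): returns request headers,
// attaching the cookie only when the variable is present and non-empty.
function buildHeaders(env: Record<string, string | undefined>): Record<string, string> {
    const cookie = env.XIAOHONGSHU_COOKIE;
    return cookie ? { Cookie: cookie } : {};
}

// Made-up values for illustration only.
const withCookie = buildHeaders({ XIAOHONGSHU_COOKIE: 'a=1; b=2' });
const withoutCookie = buildHeaders({});
console.log(withCookie, withoutCookie);
```

In Node.js the same function could be called with `process.env`. Requests that do not need the cookie, as noted in the review, would simply not use these headers.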